For a Markov chain $X = \{X_i, i = 1, 2, \ldots, n\}$ with the state space $\{0, 1\}$, the random variable
$S := \sum_{i=1}^{n} X_i$ is said to follow a Markov binomial distribution. The exact distribution of $S$,
denoted $\mathcal{L}S$, is very computationally intensive for large $n$ (see Gabriel [Biometrika 46 (1959)
454-460] and Bhat and Lal [Adv. in Appl. Probab. 20 (1988) 677-680]) and this paper concerns
suitable approximate distributions for $\mathcal{L}S$ when $X$ is stationary. We conclude that the negative
binomial and binomial distributions are appropriate approximations for $\mathcal{L}S$ when $\mathrm{Var}\,S$ is greater
than and less than $ES$, respectively. Also, due to the unique structure of the distribution, we
are able to derive explicit error estimates for these approximations.

Keywords: binomial distribution; coupling; Markov binomial distribution; negative binomial 

1. Introduction and the main results

Let $X = \{X_i, i = 1, 2, \ldots, n\}$ be a Markov chain with the state space $\{0, 1\}$ and transition
matrix

$$\begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix} = \begin{pmatrix} 1-\alpha & \alpha \\ 1-\beta & \beta \end{pmatrix}, \qquad (1.1)$$

where $\alpha, \beta \in (0,1)$. The distribution of $S := \sum_{i=1}^{n} X_i$, denoted $\mathcal{L}S$, is well known as
the Markov binomial distribution. When $X$ is stationary and $\alpha = \beta$, $\mathcal{L}S$ degenerates
to a binomial distribution. Except for the case $\alpha = \beta$, the exact distribution of $S$ (see
Gabriel (1959) and Bhat and Lal (1988)) is very computationally intensive for large $n$
and our interest is in investigating suitable approximate distributions for $\mathcal{L}S$.

It appears that Koopman (1950) and Dobrushin (1961) were among the earliest in the 
study of limit theory of Markov binomial distributions and the topic was then treated in 
many articles including Serfling (1975), Wang (1981), Serfozo (1986), He and Xia (1997),
Cekanavicius and Mikalauskas (1999), Vellaisamy and Chaudhuri (1999), Barbour and
Lindvall (2006), Cekanavicius and Roos (2007). The approximate distributions considered
are mostly normal, compound Poisson, translated Poisson or binomial distributions. For
instance, when $n\alpha/(1 - \beta + \alpha)$ converges, Wang (1981) proved that for any fixed $k$,
$P(S = k)$ converges to $P(Y = k)$, where $Y$ is a compound Poisson variable. Barbour and
Lindvall (2006) used a translated Poisson distribution to approximate the distribution
of a sum of integer-valued random variables whose distributions depend on the state
of an underlying Markov chain. Under an aperiodic condition, they established error
bounds with respect to the total variation distance, comparable to those found for normal
approximation with respect to the weaker Kolmogorov distance. On the other hand, when
the first two factorial cumulants of $\mathcal{L}S$ are matched by those of a binomial distribution,
Cekanavicius and Roos (2007) demonstrated that the binomial distribution is a suitable
approximation for $\mathcal{L}S$ with an approximation error, measured in total variation norm, in
the order of The error estimates in Barbour and Lindvall (2006) and Cekanavicius 
and Roos (2007) are of the best possible order. 

The main purpose of this paper is to find suitable approximate distributions for $\mathcal{L}S$
and provide error bounds as explicit functions of the parameters of the Markov binomial
distribution. We will show that the negative binomial and binomial distributions are
suitable approximations when $\mathrm{Var}\,S$ is greater than and less than $ES$, respectively. We
employ the celebrated Stein method for binomial (Ehm (1991)) and negative binomial
(Brown and Phillips (1999)) approximations and use the unique structure of the Markov
binomial distribution to construct a suitable coupling which enables us to specify all of
the constants involved in the estimates.

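For convenience, from now on, we will assume that $X$ is stationary. Direct computation
ensures that the stationary distribution $\pi$ of $X$ is

$$p := \pi(1) = \frac{\alpha}{1-\beta+\alpha}, \qquad \pi(0) = \frac{1-\beta}{1-\beta+\alpha}$$

and

$$ES = np,$$

$$\mathrm{Var}\,S = np(1-p) + nA_0 - A_1 + A_1(\beta-\alpha)^n, \qquad (1.2)$$

where

$$A_0 = \frac{2\alpha(1-\beta)(\beta-\alpha)}{(1-\beta+\alpha)^3}, \qquad A_1 = \frac{2\alpha(1-\beta)(\beta-\alpha)}{(1-\beta+\alpha)^4}. \qquad (1.3)$$

Note that $X$ is a stationary positive recurrent Markov chain.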
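
The moment formulas above can be checked numerically for moderate $n$. The following Python sketch is an illustration only, not part of the paper: it assumes the convention $P(X_{i+1}=1 \mid X_i=0)=\alpha$, $P(X_{i+1}=1 \mid X_i=1)=\beta$, starts the chain from its stationary law, and uses hypothetical helper names (`markov_binomial_pmf`, `var_formula`). It computes the exact law of $S$ by a forward recursion over the pair (partial sum, current state) and evaluates the closed-form variance so the two can be compared.

```python
import numpy as np


def markov_binomial_pmf(n, alpha, beta):
    """Exact pmf of S = X_1 + ... + X_n for the stationary two-state chain.

    Assumed convention: P(0 -> 1) = alpha and P(1 -> 1) = beta.
    """
    p = alpha / (1.0 - beta + alpha)      # stationary probability of state 1
    f0 = np.zeros(n + 1)                  # f0[s] = P(partial sum = s, state 0)
    f1 = np.zeros(n + 1)                  # f1[s] = P(partial sum = s, state 1)
    f0[0] = 1.0 - p
    f1[1] = p
    for _ in range(n - 1):
        g0 = (1.0 - alpha) * f0 + (1.0 - beta) * f1   # next state 0, sum unchanged
        g1 = np.zeros(n + 1)
        g1[1:] = alpha * f0[:-1] + beta * f1[:-1]     # next state 1, sum grows by 1
        f0, f1 = g0, g1
    return f0 + f1


def var_formula(n, alpha, beta):
    """Closed-form variance np(1-p) + n*A0 - A1 + A1*(beta-alpha)^n."""
    p = alpha / (1.0 - beta + alpha)
    lam = beta - alpha
    a0 = 2 * alpha * (1 - beta) * lam / (1 - beta + alpha) ** 3
    a1 = 2 * alpha * (1 - beta) * lam / (1 - beta + alpha) ** 4
    return n * p * (1 - p) + n * a0 - a1 + a1 * lam ** n
```

The recursion costs $O(n^2)$ operations, which is adequate for illustration; the closed form is what the later bounds are expressed in.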

To state the main result, we use $\mathrm{Bi}(m,\theta)$ to stand for the binomial distribution with
parameters $m$ and $0 < \theta < 1$. We say that $Y$ follows the negative binomial distribution
with parameters $r > 0$ and $0 < q < 1$, denoted by $\mathrm{NB}(r, q)$, if

$$P(Y = k) = \frac{\Gamma(r+k)}{\Gamma(r)\,k!}\,q^r(1-q)^k, \qquad k \in \mathbb{Z}_+.$$

The metric we will use for measuring the approximation errors is the total variation
distance defined as

$$d_{TV}(P,Q) := \sup_{A \subset \mathbb{Z}_+} |P(A) - Q(A)|$$

for probability distributions $P$, $Q$ on $\mathbb{Z}_+$.
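
For distributions on $\mathbb{Z}_+$ the supremum in the total variation distance is attained at the set where one mass function exceeds the other, so it equals half the $\ell_1$ distance between the mass functions. A minimal Python sketch of both ingredients follows. The function names are hypothetical, and the negative binomial parametrization (mean $r(1-q)/q$, variance $r(1-q)/q^2$) is an assumption chosen to agree with the moment matching used in Theorem 1.1.

```python
import math


def nb_pmf(k, r, q):
    """P(Y = k) for NB(r, q), assumed form Gamma(r+k)/(Gamma(r) k!) q^r (1-q)^k."""
    log_p = (math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(q) + k * math.log1p(-q))
    return math.exp(log_p)


def tv_distance(pmf_p, pmf_q):
    """Total variation distance between two pmfs listed on {0, 1, 2, ...}.

    Both inputs are assumed to sum to 1; the shorter one is padded with zeros.
    """
    size = max(len(pmf_p), len(pmf_q))
    p = list(pmf_p) + [0.0] * (size - len(pmf_p))
    q = list(pmf_q) + [0.0] * (size - len(pmf_q))
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

If one of the mass functions is truncated, the missing tail mass has to be added to the $\ell_1$ sum before halving; the sketch does not do this automatically.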

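For the Markov chain $X$ with transition matrix (1.1), we set

$$\mu_1 = \frac{1-\alpha}{\alpha}, \qquad \sigma_1^2 = \frac{1-\alpha}{\alpha^2}, \qquad \mu_2 = \frac{\beta}{1-\beta}, \qquad \sigma_2^2 = \frac{\beta}{(1-\beta)^2},$$

$$K_1 = \sqrt{\frac{5(\mu_1+\mu_2+2)}{\min(1-\alpha,\beta,1/2)}}, \qquad K_2 = \frac{90(\sigma_1^2+\sigma_2^2)}{\mu_1+\mu_2+2},$$

$$C_0 = \frac{10(\beta\vee\alpha)}{1-\beta\vee\alpha}, \qquad C_1 = \frac{|\beta-\alpha|(5+43\,\alpha\vee\beta)}{(1-\beta\vee\alpha)^2}, \qquad C_2 = \frac{(1-p)(5+23\,\alpha\vee\beta)}{(1-\alpha\vee\beta)^2}.$$

It is worthwhile to note that $\mu_1$ (resp., $\mu_2$) is the mean number of revisits of 0's (resp.,
1's) before the Markov chain moves to state 1 (resp., 0), and $\sigma_1^2$ and $\sigma_2^2$ are the variances
of the corresponding variables. The main result of the paper is as follows.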

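Theorem 1.1.

1. If $\mathrm{Var}\,S \ge ES$, then

$$d_{TV}(\mathcal{L}S, \mathrm{NB}(r,q)) \le C_1\left(\frac{2K_1}{\sqrt{n}} + \frac{4K_2}{n} + (\beta\vee\alpha)^{\lfloor n/4\rfloor}\right), \qquad (1.4)$$

where

$$r = \frac{(ES)^2}{\mathrm{Var}\,S - ES}, \qquad q = \frac{ES}{\mathrm{Var}\,S}$$

and $\mathrm{NB}(\infty, 1)$ is understood as the Poisson distribution with parameter $ES$.

2. If $\mathrm{Var}\,S < ES$, then

$$d_{TV}(\mathcal{L}(S), \mathrm{Bi}(m,\theta)) \le \left(\frac{|p-\theta|}{1-\theta}C_0 + \frac{|\beta-\alpha|}{1-\theta}C_2\right)\left(\frac{2K_1}{\sqrt{n}} + \frac{4K_2}{n} + (\beta\vee\alpha)^{\lfloor n/4\rfloor}\right) + \frac{\theta^2(\tilde m - m)}{np(1-\theta)}, \qquad (1.5)$$

where

$$\tilde m = \frac{(ES)^2}{ES - \mathrm{Var}\,S}, \qquad m = \lfloor \tilde m \rfloor, \qquad \theta = \frac{np}{m},$$

and $\lfloor \tilde m \rfloor$ is the integer part of $\tilde m$.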
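
To make the parameter choices in Theorem 1.1 concrete, the following Python sketch matches the first two moments of $\mathcal{L}S$ and then computes the actual total variation distance numerically. It is an illustration under stated assumptions rather than the paper's procedure: the helper names are hypothetical, the chain convention is $P(0\to1)=\alpha$, $P(1\to1)=\beta$, the negative binomial mass function is taken as $\Gamma(r+k)/(\Gamma(r)k!)\,q^r(1-q)^k$, and the boundary case $\mathrm{Var}\,S = ES$ (Poisson) is not handled. It does not evaluate the bounds (1.4) and (1.5) themselves.

```python
import math
import numpy as np


def markov_binomial_pmf(n, alpha, beta):
    """Exact pmf of S for the stationary chain (forward recursion)."""
    p = alpha / (1.0 - beta + alpha)
    f0, f1 = np.zeros(n + 1), np.zeros(n + 1)
    f0[0], f1[1] = 1.0 - p, p
    for _ in range(n - 1):
        g0 = (1.0 - alpha) * f0 + (1.0 - beta) * f1
        g1 = np.zeros(n + 1)
        g1[1:] = alpha * f0[:-1] + beta * f1[:-1]
        f0, f1 = g0, g1
    return f0 + f1


def nb_pmf_vector(r, q, size):
    """NB(r, q) masses on {0, ..., size-1}, computed on the log scale."""
    return np.array([
        math.exp(math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
                 + r * math.log(q) + k * math.log1p(-q))
        for k in range(size)
    ])


def binomial_pmf_vector(m, theta, size):
    """Bi(m, theta) masses on {0, ..., size-1}; fine for m up to about 1000."""
    out = np.zeros(size)
    for k in range(min(m, size - 1) + 1):
        out[k] = math.comb(m, k) * theta ** k * (1.0 - theta) ** (m - k)
    return out


def approximate(n, alpha, beta):
    """Return ("NB" or "Bi", total variation distance to the matched law)."""
    pmf = markov_binomial_pmf(n, alpha, beta)
    k = np.arange(n + 1)
    mean = float((k * pmf).sum())
    var = float(((k - mean) ** 2 * pmf).sum())
    if var > mean:
        r, q = mean ** 2 / (var - mean), mean / var
        approx, kind = nb_pmf_vector(r, q, n + 1), "NB"
    else:
        m = math.floor(mean ** 2 / (mean - var))
        theta = mean / m                      # assumed to be below 1 here
        approx, kind = binomial_pmf_vector(m, theta, n + 1), "Bi"
    # Mass of the approximating law beyond n has no counterpart in L(S).
    tail = max(0.0, 1.0 - float(approx.sum()))
    return kind, 0.5 * (float(np.abs(pmf - approx).sum()) + tail)
```

For example, `approximate(200, 0.2, 0.6)` falls in the negative binomial case, because positive dependence ($\beta > \alpha$) inflates the variance above the mean, while `approximate(100, 0.6, 0.3)` falls in the binomial case.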



Remark 1.1. In practical situations, $\alpha$ and $\beta$ are usually fixed, so the bounds in Theorem 1.1
are of order $\frac{1}{\sqrt{n}}$. The constants $K_i$ and $C_i$ are useful when both $\alpha$ and $\beta$ are
a reasonable distance from 0 and 1. If $\alpha$ is close to 0 and $\beta$ is close to 1, then $\mathcal{L}S$ is
not unimodal, so one should not expect good approximation by a negative binomial or
binomial distribution. On the other hand, when $\alpha$ is close to 1 and $\beta$ is close to 0, $ES$
is close to $\frac{n}{2}$, but $\mathrm{Var}\,S$ will be close to 0 for even $n$ and $\frac{1}{4}$ for odd $n$, meaning that we
should not expect a good binomial approximation in this case either since the accuracy
of approximation is a function of $\mathrm{Var}\,S$. If both $\alpha$ and $\beta$ are close to 0, then Poisson
approximation to $\mathcal{L}S$ (see Barbour et al. (1992), Theorem 8.H) is generally sufficient. If
both $\alpha$ and $\beta$ are close to 1, one should consider approximating $\mathcal{L}(n - S)$ instead of $\mathcal{L}S$.

Remark 1.2. Except when both $\alpha$ and $\beta$ are very small, Poisson approximation to $\mathcal{L}S$
(see Barbour et al. (1992), Theorem 8.H) is inadequate since the error bound of Poisson
approximation will not become small when $n$ becomes large.

Remark 1.3. Lemma 2.2, proved in the next section, states that a necessary condition
for (1.4) is that $\beta > \alpha$.

Remark 1.4. It is easy to see that if $A_0 > p^2$, then $\mathrm{Var}\,S > ES$ for sufficiently large $n$.
In this case, as $n \to \infty$, $r \sim \frac{np^2}{A_0 - p^2}$ and $q \to \frac{p}{p + A_0 - p^2}$.

Remark 1.5. As $n \to \infty$, $m \sim \left\lfloor \frac{np^2}{p^2 - A_0} \right\rfloor$ and $\theta \to \frac{p^2 - A_0}{p} < 1$. Note that if $\alpha = \beta$, $\mathcal{L}S$
degenerates to $\mathrm{Bi}(n,p)$ and $m = \tilde m$, so the upper bound of (1.5) becomes 0.

Remark 1.6. Although the estimates in Theorem 1.1 are established for stationary $X$,
since a Markov chain with transition matrix (1.1) and any initial distribution converges
exponentially fast to the stationary distribution (see the coupling constructed in the proof
of Lemma 2.4), our bounds can be adapted for approximating a Markov binomial distribution
with any initial distribution, provided that an error estimate for the difference
between the Markov binomial distribution and $\mathcal{L}S$ is added to the upper bounds.

2. Preliminary studies of the Markov binomial 
distribution 

To prove Theorem 1.1, we need the following preparation. 

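Lemma 2.1. Suppose $\{Y_j : j \ge 0\}$ is a Markov chain with transition matrix (1.1) and
$Y_0 = 0$. Define $W = \sum_{i=1}^{n} Y_i$. We then have

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(W + 1)) \le \gamma(n), \qquad (2.1)$$

where

$$\gamma(x) := \frac{K_1}{\sqrt{x}} + \frac{K_2}{x} \quad \text{for } x > 0,$$

and $K_1$ and $K_2$ are as given in Section 1.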
Proof. We construct another version of the Markov chain $\{Y_i\}$, denoted $\{Y_i'\}$, such that
$P(W + 1 \ne W') \le \gamma(n)$, where $W' = \sum_{i=1}^{n} Y_i'$. To this end, let $\rho_0 = 0$ and, for $j \ge 1$, let
$\rho_j = \inf\{i > \rho_{j-1} : Y_i \ne Y_{\rho_{j-1}}\}$. The $\{\rho_j\}$ are then stopping times separating the Markov
chain into blocks of 0's and 1's. In other words, if we set $\xi_j = \rho_j - \rho_{j-1} - 1$ for $j \ge 1$,
then $\xi_1$ is the number of revisits of 0's for the Markov chain before it moves to state 1,
followed by $\xi_2$ revisits of 1's before it moves to 0, etcetera. By the regenerative theory
(see Thorisson (2000), page 53), $\{\xi_j, j \ge 1\}$ are independent random variables,
$\xi_{2j-1}$ follows the geometric distribution with parameter $\alpha$ and $\xi_{2j}$ has geometric distribution
with parameter $1 - \beta$ for all $j \ge 1$. We write

$$\mu_1 = E\xi_1 = \frac{1-\alpha}{\alpha}, \qquad \mu_2 = E\xi_2 = \frac{\beta}{1-\beta}, \qquad \sigma_1^2 = \mathrm{Var}\,\xi_1 = \frac{1-\alpha}{\alpha^2}, \qquad \sigma_2^2 = \mathrm{Var}\,\xi_2 = \frac{\beta}{(1-\beta)^2}.$$

For fixed $n$, there are about $\frac{n}{\mu_1+\mu_2+2}$ blocks each of 0's and 1's, so we let $k = \lfloor cn \rfloor + 1$
with $c$ close to $(\mu_1 + \mu_2 + 2)^{-1}$. On the other hand, to further simplify the estimate
in (2.5) below, it is convenient to take $c = 0.8(\mu_1 + \mu_2 + 2)^{-1}$. Let $T_k = \sum_{j=1}^{k} \xi_{2j-1}$ and
$L_k = \sum_{j=1}^{k} \xi_{2j}$. Using Barbour and Xia (1999), Proposition 4.6, we have

$$d_{TV}(\mathcal{L}(T_k), \mathcal{L}(T_k + 1)) \le \frac{1}{\sqrt{cn \min(u_1, 1/2)}},$$

where $u_1 := 1 - d_{TV}(\mathcal{L}(\xi_1), \mathcal{L}(\xi_1 + 1)) = 1 - \alpha$. We then choose a maximal coupling
$(T_k, T_k' + 1)$ of $\mathcal{L}(T_k)$ and $\mathcal{L}(T_k + 1)$ (Barbour et al. (1992), page 254) such that

$$d_{TV}(\mathcal{L}(T_k), \mathcal{L}(T_k + 1)) = P(T_k \ne T_k' + 1) \le \frac{1}{\sqrt{cn \min(1-\alpha, 1/2)}} \qquad (2.2)$$

and write $\{\xi_{2j-1}', 1 \le j \le k\}$ for the i.i.d. random variables satisfying $T_k' = \sum_{j=1}^{k} \xi_{2j-1}'$.
On the other hand, since $\{\xi_{2j}, j \ge 1\}$ play exactly the same role as $\{\xi_{2j-1}, j \ge 1\}$ with
0 and 1 swapped, there exists a maximal coupling $(L_k + 1, L_k')$ of $\mathcal{L}(L_k + 1)$ and $\mathcal{L}(L_k)$
such that $(L_k, L_k')$ is independent of $(T_k, T_k')$ and

$$P(L_k + 1 \ne L_k') = d_{TV}(\mathcal{L}(L_k + 1), \mathcal{L}(L_k)) \le \frac{1}{\sqrt{cn \min(\beta, 1/2)}}. \qquad (2.3)$$

We write $\{\xi_{2j}', 1 \le j \le k\}$ for the i.i.d. random variables satisfying $L_k' = \sum_{j=1}^{k} \xi_{2j}'$.

Define $\rho_0' = 0$ and $\rho_j' = \rho_{j-1}' + \xi_j' + 1$, $1 \le j \le 2k$. We now couple $\{Y_i'\}$ with $\{Y_i\}$ by
setting

$$Y_i' = \begin{cases} 0, & \text{for } \rho_{2j-2}' \le i < \rho_{2j-1}',\ 1 \le j \le k, \\ 1, & \text{for } \rho_{2j-1}' \le i < \rho_{2j}',\ 1 \le j \le k, \\ Y_i, & \text{for } i \ge \rho_{2k}'. \end{cases}$$
Under the conditions that $\rho_{2k} \le n$, $T_k = T_k' + 1$ and $L_k + 1 = L_k'$, we have $W' = W + 1$.
Hence,

$$P(W + 1 \ne W') \le P(\rho_{2k} > n) + P(T_k \ne T_k' + 1) + P(L_k + 1 \ne L_k'). \qquad (2.4)$$

Without loss of generality, we may assume that $cn \ge 8$. In fact, if $cn < 8$, then $\frac{K_1}{\sqrt{n}} > 1$
and (2.1) clearly holds. Using Chebyshev's inequality, we get

$$P(\rho_{2k} > n) \le \frac{\mathrm{Var}(\rho_{2k})}{(n - E\rho_{2k})^2} = \frac{k(\sigma_1^2 + \sigma_2^2)}{(n - k(\mu_1 + \mu_2 + 2))^2} \le \frac{(cn+1)(\sigma_1^2 + \sigma_2^2)}{(n - (cn+1)(\mu_1 + \mu_2 + 2))^2} \le \frac{1.125\,cn(\sigma_1^2 + \sigma_2^2)}{(n - 1.125\,cn(\mu_1 + \mu_2 + 2))^2} = \frac{K_2}{n}. \qquad (2.5)$$

Finally, combining the estimates (2.2), (2.3) and (2.5) with (2.4) yields (2.1). □
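
The quantity controlled by Lemma 2.1 can also be computed exactly for moderate $n$, which gives a feel for the $n^{-1/2}$ rate. The Python sketch below is illustrative only, with hypothetical function names and the assumed convention $P(0\to1)=\alpha$, $P(1\to1)=\beta$. It builds the law of $W$ for the chain started at $Y_0 = 0$ and evaluates $d_{TV}(\mathcal{L}(W), \mathcal{L}(W+1))$ as half the sum of absolute successive differences of the mass function.

```python
import numpy as np


def pmf_w(n, alpha, beta):
    """Law of W = Y_1 + ... + Y_n for the chain started at Y_0 = 0."""
    f0, f1 = np.zeros(n + 1), np.zeros(n + 1)
    f0[0] = 1.0                                   # time 0: state 0, nothing counted
    for _ in range(n):
        g0 = (1.0 - alpha) * f0 + (1.0 - beta) * f1
        g1 = np.zeros(n + 1)
        g1[1:] = alpha * f0[:-1] + beta * f1[:-1]
        f0, f1 = g0, g1
    return f0 + f1


def tv_shift(pmf):
    """d_TV(L(W), L(W+1)) = 0.5 * sum_k |P(W = k) - P(W = k-1)|."""
    padded = np.concatenate(([0.0], pmf, [0.0]))
    return 0.5 * float(np.abs(np.diff(padded)).sum())
```

Comparing `tv_shift(pmf_w(100, 0.3, 0.5))` with `tv_shift(pmf_w(400, 0.3, 0.5))` shows the distance shrinking as $n$ grows, in line with the leading term of $\gamma(n)$.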

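Lemma 2.2. If $\mathrm{Var}\,S \ge ES$, then $\beta > \alpha$.

Proof. By (1.2), we have

$$\mathrm{Var}\,S - ES = -np^2 + nA_0 - A_1(1 - (\beta - \alpha)^n)$$
$$= -np^2 + nA_0 - A_0(1 + (\beta - \alpha) + (\beta - \alpha)^2 + \cdots + (\beta - \alpha)^{n-1})$$
$$= -np^2 + A_0(n - (1 + (\beta - \alpha) + (\beta - \alpha)^2 + \cdots + (\beta - \alpha)^{n-1})).$$

Clearly, $n - (1 + (\beta - \alpha) + (\beta - \alpha)^2 + \cdots + (\beta - \alpha)^{n-1}) \ge 0$. If $\beta \le \alpha$, then $A_0 = \frac{2\alpha(1-\beta)(\beta-\alpha)}{(1-\beta+\alpha)^3} \le 0$,
so $\mathrm{Var}\,S - ES < 0$, contradicting the assumption. □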
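
The sign argument in Lemma 2.2 is easy to probe numerically. The short Python sketch below is an illustration with a hypothetical function name; it evaluates the last expression in the proof, $-np^2 + A_0\bigl(n - \sum_{k=0}^{n-1}(\beta-\alpha)^k\bigr)$, so that one can confirm it is negative whenever $\beta \le \alpha$.

```python
def var_minus_mean(n, alpha, beta):
    """Var S - ES = -n p^2 + A0 * (n - sum_{k<n} (beta - alpha)^k)."""
    p = alpha / (1.0 - beta + alpha)
    lam = beta - alpha
    a0 = 2.0 * alpha * (1.0 - beta) * lam / (1.0 - beta + alpha) ** 3
    geometric_sum = sum(lam ** k for k in range(n))
    return -n * p ** 2 + a0 * (n - geometric_sum)
```

A positive value can only occur for $\beta > \alpha$, for instance with $\alpha = 0.2$, $\beta = 0.6$ and $n = 200$.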

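Lemma 2.3. If $h$ is a bounded function on $\mathbb{Z}_+$, and $V_1$, $V_2$ and $V$ are $\mathbb{Z}_+$-valued random
variables coupled in such a way that $V$ is independent of $(V_1, V_2)$, then

$$\left| E \sum_{j=0}^{V_2 - 1} [h(V_1 + j + V) - h(V)] \right| \le 2\epsilon_V \|h\|\, E\left[ V_1 V_2 + \tfrac{1}{2}(V_2 - 1)V_2 \right],$$

where $\epsilon_V := d_{TV}(\mathcal{L}(V), \mathcal{L}(V + 1))$.

Proof. We write $\Delta h(\cdot) = h(\cdot + 1) - h(\cdot)$. Then,

$$\left| E \sum_{j=0}^{V_2 - 1} [h(V_1 + j + V) - h(V)] \right|$$
$$= \left| \sum_{i_1, i_2} E\left( \sum_{j=0}^{i_2 - 1} [h(i_1 + j + V) - h(V)] \,\Big|\, (V_1, V_2) = (i_1, i_2) \right) P((V_1, V_2) = (i_1, i_2)) \right|$$
$$\le \sum_{i_1, i_2} \sum_{j=0}^{i_2 - 1} |E[h(i_1 + j + V) - h(V)]|\, P((V_1, V_2) = (i_1, i_2))$$
$$= \sum_{i_1, i_2} \sum_{j=0}^{i_2 - 1} \left| E \sum_{l=0}^{i_1 + j - 1} \Delta h(V + l) \right| P((V_1, V_2) = (i_1, i_2))$$
$$\le 2\epsilon_V \|h\| \sum_{i_1, i_2} \sum_{j=0}^{i_2 - 1} (i_1 + j)\, P((V_1, V_2) = (i_1, i_2))$$
$$= 2\epsilon_V \|h\|\, E\left[ V_1 V_2 + \tfrac{1}{2}(V_2 - 1)V_2 \right]. \qquad □$$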



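Lemma 2.4. Write $\mathcal{L}(S - X_i \mid X_i = j) := \mathcal{L}(S^{i,j})$ for $1 \le i \le n$ and $j = 0, 1$. If $h$ is a
bounded function on $\mathbb{Z}_+$, then

$$|Eh(S^{i,1}) - Eh(S)| \le \|h\| \frac{10\,\alpha\vee\beta}{1 - \alpha\vee\beta} \left( \gamma(n/4) + (\alpha\vee\beta)^{\lfloor n/4 \rfloor} \right), \qquad (2.6)$$

$$|E[h(S^{i,1}) - h(S^{i,0})] - E(S^{i,1} - S^{i,0}) E\Delta h(S)| \le \|\Delta h\| \frac{|\alpha - \beta|(5 + 23\,\alpha\vee\beta)}{(1 - \alpha\vee\beta)^2} \left( \gamma(n/4) + (\alpha\vee\beta)^{\lfloor n/4 \rfloor} \right). \qquad (2.7)$$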


Proof. We construct two copies of Markov chains having transition matrix (1.1), with
one starting at state 1 and the other at state 0 at time $i$ in such a way that they can
meet as soon as possible in both directions and, once they meet, they stay together
from then on. To this end, we define a two-dimensional Markov chain $\{(Z_l^{i,1}, Z_l^{i,0}), l \ge i\}$
with state space $\{(0, 0), (0, 1), (1, 0), (1, 1)\}$, initial state $(Z_i^{i,1}, Z_i^{i,0}) = (1, 0)$ and transition
probabilities

$$P_{(1,0)}(j_2, j_1) = P_{(0,1)}(j_1, j_2) = \begin{cases} p_{0j} \wedge p_{1j}, & \text{if } j_1 = j_2 = j, \\ \beta - \alpha, & \text{if } \beta > \alpha,\ j_1 = 0,\ j_2 = 1, \\ \alpha - \beta, & \text{if } \beta < \alpha,\ j_1 = 1,\ j_2 = 0; \end{cases}$$

$$P_{(i,i)}(j, j) = p_{ij} \quad \text{for } i, j = 0, 1. \qquad (2.8)$$
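
The coupling in (2.8) can be written out as a $4 \times 4$ transition kernel and checked mechanically. The Python sketch below is an illustration under the assumed convention $p_{01} = \alpha$, $p_{11} = \beta$. The state ordering, the function names and the reading of the unequal-state rows (common moves with probability $p_{0j} \wedge p_{1j}$, the leftover mass $|\beta - \alpha|$ on the single remaining unequal state) are choices made for this example. It verifies that each coordinate, viewed on its own, still moves according to (1.1).

```python
import numpy as np

STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]   # (first chain, second chain)


def coupled_kernel(alpha, beta):
    """Transition kernel of the pair chain described by (2.8)."""
    p = np.array([[1.0 - alpha, alpha], [1.0 - beta, beta]])
    index = {s: k for k, s in enumerate(STATES)}
    kernel = np.zeros((4, 4))
    for (a, b) in STATES:
        row = index[(a, b)]
        if a == b:                            # already met: move together
            for j in (0, 1):
                kernel[row, index[(j, j)]] = p[a, j]
        else:                                 # not yet met: meet with maximal probability
            for j in (0, 1):
                kernel[row, index[(j, j)]] = min(p[0, j], p[1, j])
            # Leftover mass: the pair keeps its states if beta > alpha, swaps otherwise.
            target = (a, b) if beta > alpha else (b, a)
            kernel[row, index[target]] += abs(beta - alpha)
    return kernel


def marginals_match(kernel, alpha, beta):
    """Check that each coordinate alone is a chain with matrix (1.1)."""
    p = np.array([[1.0 - alpha, alpha], [1.0 - beta, beta]])
    for row, (a, b) in enumerate(STATES):
        first = sum(kernel[row, col] for col, (c, _) in enumerate(STATES) if c == 1)
        second = sum(kernel[row, col] for col, (_, d) in enumerate(STATES) if d == 1)
        if not (np.isclose(first, p[a, 1]) and np.isclose(second, p[b, 1])):
            return False
    return True
```

In this kernel the probability that the two chains fail to meet in one step is $|\beta - \alpha|$, which is the source of the geometric tail of the meeting time used later in the proof.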



Since the reverse chain $\tilde X$ of $X$ has the same transition matrix as that of $X$, we can
construct a reverse chain $\{(Z_l^{i,1}, Z_l^{i,0}), l \le i\}$ of $\{(Z_l^{i,1}, Z_l^{i,0}), l \ge i\}$ in the same way as in
(2.8).

As $i$ is fixed, we drop the subindex $i$ and define

$$\varsigma = \min\{l - i > 0 : Z_l^{i,1} = Z_l^{i,0}\}, \qquad \tilde\varsigma = \min\{i - l > 0 : Z_l^{i,1} = Z_l^{i,0}\},$$

$$\tau = \min\{l - i > 0 : Z_l^{i,1} = Z_l^{i,0} = 0\}$$

and

$$\tilde\tau = \min\{i - l > 0 : Z_l^{i,1} = Z_l^{i,0} = 0\}.$$

$\varsigma$ and $\tilde\varsigma$ then have the same distribution, as do $\tau$ and $\tilde\tau$. Moreover,

$$P(\varsigma \ge m) = |\beta - \alpha|^{m-1}, \qquad P(\tau \ge m) \le (\beta \vee \alpha)^{m-1}, \qquad m \ge 1,$$

$$E(\tilde\tau - 1) = E(\tau - 1) \le \frac{\beta \vee \alpha}{1 - \beta \vee \alpha}, \qquad (2.9)$$

$$E[(\tilde\tau - 1)^2] = E[(\tau - 1)^2] \le \frac{(\beta \vee \alpha)(1 + \beta \vee \alpha)}{(1 - \beta \vee \alpha)^2}. \qquad (2.10)$$

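By (2.8) and the regenerative theory, the left range $\{(Z_l^{i,1}, Z_l^{i,0}) : l < i - \tilde\tau\}$, the middle
range $\{(Z_l^{i,1}, Z_l^{i,0}) : l \in [i - \tilde\tau, i + \tau]\}$ and the right range $\{(Z_l^{i,1}, Z_l^{i,0}) : l > i + \tau\}$ are
independent. If we stipulate $\sum_{j=a}^{b} = 0$ for $b < a$ and let

$$S_l^i = \sum_{j=1}^{i - \tilde\tau - 1} Z_j^{i,1}, \qquad S_r^i = \sum_{j=i+\tau+1}^{n} Z_j^{i,1},$$

$$C^{i,1} = \sum_{j=(i-\tilde\tau)\vee 1,\ j \ne i}^{(i+\tau)\wedge n} Z_j^{i,1}, \qquad C^{i,0} = \sum_{j=(i-\tilde\tau)\vee 1,\ j \ne i}^{(i+\tau)\wedge n} Z_j^{i,0},$$

then we can write

$$S^{i,1} = S_l^i + S_r^i + C^{i,1}, \qquad S^{i,0} = S_l^i + S_r^i + C^{i,0}. \qquad (2.11)$$

Let $U_i := S_l^i + S_r^i$. We wish to estimate $\epsilon_i := d_{TV}(\mathcal{L}(U_i), \mathcal{L}(U_i + 1))$. Due to the symmetry
about $i$ of the Markov chain coupled, it suffices to estimate $\epsilon_i$ for $i \le n/2$. By the definition
of $S_r^i$ and Lemma 2.1,

$$d_{TV}(\mathcal{L}(S_r^i + 1), \mathcal{L}(S_r^i)) = d_{TV}\left( \mathcal{L}\Big( \sum_{j=i+\tau+1}^{n} Z_j^{i,1} + 1 \Big), \mathcal{L}\Big( \sum_{j=i+\tau+1}^{n} Z_j^{i,1} \Big) \right)$$
$$\le \sum_{a \le n/4} d_{TV}\left( \mathcal{L}\Big( \sum_{j=i+a+1}^{n} Z_j^{i,1} + 1 \,\Big|\, \tau = a \Big), \mathcal{L}\Big( \sum_{j=i+a+1}^{n} Z_j^{i,1} \,\Big|\, \tau = a \Big) \right) P(\tau = a) + P(\tau > n/4)$$
$$\le \gamma(n/4) + P(\tau > n/4),$$

which, because of the independence of $S_l^i$ and $S_r^i$, ensures that

$$\epsilon_i \le \gamma(n/4) + P(\tau > n/4) \le \gamma(n/4) + (\alpha \vee \beta)^{\lfloor n/4 \rfloor}. \qquad (2.12)$$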
To compare $S^{i,1}$ with $S$, we let $\{Y_l'\} = \{Z_l^{i,1}\}$ with probability $p$ and $\{Y_l'\} = \{Z_l^{i,0}\}$
with probability $1 - p$ so that $\{Y_l' : 0 < l \le n\}$ has the same distribution as $X$. Next,
replace $\{Y_l' : l \in [i - \tilde\tau, i + \tau]\}$ with $\{Y_l'' : l \in [i - \tilde\tau, i + \tau]\}$, which has the same distribution
as $\{Y_l' : l \in [i - \tilde\tau, i + \tau]\}$, but is independent of $\{(Z_l^{i,1}, Z_l^{i,0}) : 1 \le l \le n\}$. Define

$$Z_l' = \begin{cases} Y_l'', & l \in [i - \tilde\tau, i + \tau], \\ Y_l', & l > i + \tau \text{ or } l < i - \tilde\tau, \end{cases} \qquad \text{and} \qquad C = \sum_{l=(i-\tilde\tau)\vee 1}^{(i+\tau)\wedge n} Z_l'$$

so that $S' := S_l^i + S_r^i + C$ follows the distribution $\mathcal{L}S$. By Lemma 2.3, we have

$$|E[h(S^{i,1}) - h(S')]| \le |E[h(U_i + C^{i,1}) - h(U_i)]| + |E[h(U_i + C) - h(U_i)]| \le 2\epsilon_i \|h\| (EC^{i,1} + EC).$$

However, it follows from (2.9) that

$$E(C^{i,1} \vee C^{i,0}) \le E(\tau - 1) + E(\tilde\tau - 1) \le \frac{2\,\alpha\vee\beta}{1 - \alpha\vee\beta}$$

and

$$EC \le pE(\tau + \tilde\tau - 1) + (1 - p)E(\tau + \tilde\tau - 2) \le \frac{2\,\alpha\vee\beta}{1 - \alpha\vee\beta} + p \le \frac{3\,\alpha\vee\beta}{1 - \alpha\vee\beta}. \qquad (2.13)$$

Therefore,

$$|E[h(S^{i,1}) - h(S')]| \le \frac{10\,\alpha\vee\beta}{1 - \alpha\vee\beta} \|h\| \epsilon_i, \qquad (2.14)$$

which, together with (2.12), ensures (2.6).

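To estimate (2.7), noting that $\beta > \alpha$ implies $C^{i,1} \ge C^{i,0}$, while $\beta < \alpha$ gives $C^{i,1} \le C^{i,0}$,
and swapping 0 and 1 in the superscripts, if necessary, we may assume without loss of
generality that $C^{i,1} \ge C^{i,0}$. Observing that $U_i$ is independent of $(C^{i,1} - C^{i,0}, C)$, we obtain
from Lemma 2.3 that

$$|E[h(S^{i,1}) - h(S^{i,0})] - E(S^{i,1} - S^{i,0}) E\Delta h(S)|$$
$$= |E[h(U_i + C^{i,1}) - h(U_i + C^{i,0})] - E(C^{i,1} - C^{i,0}) E\Delta h(S')|$$
$$\le |E[h(U_i + C^{i,1}) - h(U_i + C^{i,0})] - E(C^{i,1} - C^{i,0}) E\Delta h(U_i)| + E(C^{i,1} - C^{i,0}) |E[\Delta h(U_i) - \Delta h(S')]|$$
$$\le \left| E \sum_{j=0}^{C^{i,1} - C^{i,0} - 1} [\Delta h(U_i + C^{i,0} + j) - \Delta h(U_i)] \right| + 2\epsilon_i \|\Delta h\| E(C^{i,1} - C^{i,0}) EC \qquad (2.15)$$
$$\le 2\epsilon_i \|\Delta h\| \left\{ E\left[ C^{i,0}(C^{i,1} - C^{i,0}) + \tfrac{1}{2}(C^{i,1} - C^{i,0})(C^{i,1} - C^{i,0} - 1) \right] + E(C^{i,1} - C^{i,0}) EC \right\}.$$

Now, again using (2.9), we have

$$E(C^{i,1} - C^{i,0}) \le E(\varsigma - 1) + E(\tilde\varsigma - 1) = \frac{2|\alpha - \beta|}{1 - |\alpha - \beta|}, \qquad (2.16)$$

$$E[(C^{i,1} - C^{i,0})(C^{i,1} - C^{i,0} - 1)] \le E[(\varsigma + \tilde\varsigma - 2)(\varsigma + \tilde\varsigma - 3)] = \frac{6|\alpha - \beta|^2}{(1 - |\alpha - \beta|)^2}. \qquad (2.17)$$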

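To estimate $E(C^{i,0}(C^{i,1} - C^{i,0}))$, for $k = 0, 1$, define

$$C^{i,k,+} = \sum_{l=i+1}^{(i+\tau)\wedge n} Z_l^{i,k}, \qquad C^{i,k,-} = \sum_{l=(i-\tilde\tau)\vee 1}^{i-1} Z_l^{i,k}$$

and $\tau_{i,0} = \inf\{l \ge 1 : Z_{i+\varsigma+l}^{i,0} = 0\}$. The conditional distribution of $\tau_{i,0}$ given $Z_{i+\varsigma}^{i,0} = 1$ is
then the same as $\mathcal{L}(\xi_2 + 1)$. Since $(C^{i,1,+}, C^{i,0,+})$ and $(C^{i,1,-}, C^{i,0,-})$ are independent, and,
for convenience, we may assume that they are identically distributed, it follows that

$$E[C^{i,0}(C^{i,1} - C^{i,0})] = E[(C^{i,0,-} + C^{i,0,+})((C^{i,1,+} - C^{i,0,+}) + (C^{i,1,-} - C^{i,0,-}))]$$
$$= 2E(C^{i,0,+}) E(C^{i,1,-} - C^{i,0,-}) + 2E[C^{i,0,+}(C^{i,1,+} - C^{i,0,+})]. \qquad (2.18)$$

On the other hand,

$$E(C^{i,0,+}) \le E(\tau - 1) \le \frac{\beta \vee \alpha}{1 - \beta \vee \alpha}, \qquad (2.19)$$

$$E(C^{i,1,-} - C^{i,0,-}) \le E(\tilde\varsigma - 1) = \frac{|\alpha - \beta|}{1 - |\alpha - \beta|}, \qquad (2.20)$$

$$E[C^{i,0,+}(C^{i,1,+} - C^{i,0,+})] = E\left\{ \left( \sum_{l=i+1}^{(i+\varsigma-1)\wedge n} Z_l^{i,0} + \sum_{l=(i+\varsigma)\wedge n}^{(i+\tau-1)\wedge n} Z_l^{i,0} \right) \left( \sum_{l=i+1}^{(i+\varsigma-1)\wedge n} (Z_l^{i,1} - Z_l^{i,0}) \right) \right\}$$
$$\le \frac{1}{4} E[(\varsigma - 1)^2] + E\left\{ \left( \sum_{l=(i+\varsigma)\wedge n}^{(i+\tau-1)\wedge n} Z_l^{i,0} \right) \left( \sum_{l=i+1}^{(i+\varsigma-1)\wedge n} (Z_l^{i,1} - Z_l^{i,0}) \right) \right\}$$
$$\le \frac{1}{4} E[(\varsigma - 1)^2] + E[\tau_{i,0}(\varsigma - 1) \mid Z_{i+\varsigma}^{i,0} = 1]\, P(Z_{i+\varsigma}^{i,0} = 1) \qquad (2.21)$$
$$\le \frac{1}{4} E[(\varsigma - 1)^2] + \frac{1}{1 - \beta} E(\varsigma - 1)$$
$$\le \frac{|\alpha - \beta|(1.25 + 0.25\,\alpha\vee\beta)}{(1 - \alpha\vee\beta)^2},$$

where $\frac{1}{4}$ in the first inequality of (2.21) is due to the fact that $a(b - a) \le \frac{b^2}{4}$ for all $a$ and
$b$. Combining (2.18)-(2.21) yields

$$E[C^{i,0}(C^{i,1} - C^{i,0})] \le \frac{|\alpha - \beta|(2.5 + 2.5\,\alpha\vee\beta)}{(1 - \alpha\vee\beta)^2}. \qquad (2.22)$$

Therefore, collecting the estimates of (2.13), (2.16), (2.17) and (2.22), we obtain from
(2.15) that

$$|E[h(S^{i,1}) - h(S^{i,0})] - E(S^{i,1} - S^{i,0}) E\Delta h(S)| \le \epsilon_i \|\Delta h\| \frac{|\alpha - \beta|(5 + 23\,\alpha\vee\beta)}{(1 - \alpha\vee\beta)^2},$$

which, together with (2.12), yields (2.7). □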



3. Proofs of the main results 

Proof of (1.4). Set $a = r(1 - q)$ and $b = 1 - q$. Let

$$\mathcal{B}g(j) = (a + bj)g(j + 1) - jg(j)$$

be the Stein operator for the negative binomial distribution $\mathrm{NB}(r, q)$ (Brown and
Xia (2001)). For $A \subset \mathbb{Z}_+$, let $g_A : \mathbb{Z}_+ \to \mathbb{R}$ be the bounded solution of the Stein equation

$$\mathcal{B}g(j) = 1_{\{j \in A\}} - \mathrm{NB}(r, q)(A) \quad \text{for all } j \ge 0.$$

Then,

$$d_{TV}(\mathcal{L}(S), \mathrm{NB}(r, q)) = \sup_{A \subset \mathbb{Z}_+} |E1_{\{S \in A\}} - \mathrm{NB}(r, q)(A)| = \sup_{A \subset \mathbb{Z}_+} |E\mathcal{B}g_A(S)|.$$

It hence remains to show that $|E\mathcal{B}g_A(S)|$ is bounded by the right-hand side of (1.4) for
every $A \subset \mathbb{Z}_+$. For convenience, we drop the subindex $A$ and write $g$ for $g_A$, and define
$g'(\cdot) = g(\cdot + 1)$. Brown and Xia (2001), Theorem 2.10, states that

$$\|\Delta g'\| := \sup_{j \in \mathbb{Z}_+} |\Delta g'(j)| \le \frac{1}{a}. \qquad (3.1)$$
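
The defining property of a Stein operator is that it has zero mean under the target distribution for every bounded test function. The Python sketch below illustrates this for the negative binomial operator with $a = r(1-q)$ and $b = 1-q$. It is a numerical illustration with hypothetical function names, it assumes the mass function $\Gamma(r+k)/(\Gamma(r)k!)\,q^r(1-q)^k$, and it truncates the infinite sum at a cutoff beyond which the masses are negligible.

```python
import math


def nb_pmf(k, r, q):
    """Assumed NB(r, q) mass function, evaluated on the log scale."""
    return math.exp(math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
                    + r * math.log(q) + k * math.log1p(-q))


def nb_stein(g, r, q, j):
    """(a + b j) g(j + 1) - j g(j) with a = r(1 - q), b = 1 - q."""
    a, b = r * (1.0 - q), 1.0 - q
    return (a + b * j) * g(j + 1) - j * g(j)


def expected_nb_stein(g, r, q, cutoff=2000):
    """Truncated expectation of the Stein operator applied to g under NB(r, q)."""
    return sum(nb_pmf(j, r, q) * nb_stein(g, r, q, j) for j in range(cutoff))
```

Calling `expected_nb_stein(math.cos, 3.5, 0.4)` returns a value that is zero up to rounding error; replacing the negative binomial weights by the law of $S$ gives the quantity $E\mathcal{B}g(S)$ that the proof bounds.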
Direct computation gives

$$E\mathcal{B}g(S) = aEg(S + 1) - (1 - b)p \sum_{i=1}^{n} Eg(S^{i,1} + 2) + p \sum_{i=1}^{n} E\Delta g(S^{i,1} + 1)$$
$$= aEg'(S) - (1 - b)p \sum_{i=1}^{n} Eg'(S^{i,1} + 1) + p \sum_{i=1}^{n} E\Delta g'(S^{i,1}).$$

Let

$$a = n(1 - b)p. \qquad (3.2)$$

Then,

$$E\mathcal{B}g(S) = (1 - b)p^2 \sum_{i=1}^{n} Eg'(S^{i,1} + 1) + (1 - b)p(1 - p) \sum_{i=1}^{n} Eg'(S^{i,0}) - (1 - b)p \sum_{i=1}^{n} Eg'(S^{i,1} + 1) + p \sum_{i=1}^{n} E\Delta g'(S^{i,1})$$
$$= [p^2 + bp(1 - p)] \sum_{i=1}^{n} E\Delta g'(S^{i,1}) - (1 - b)p(1 - p) \sum_{i=1}^{n} E[g'(S^{i,1}) - g'(S^{i,0})].$$

Set

$$n[p^2 + bp(1 - p)] = (1 - b)p(1 - p) \sum_{i=1}^{n} E(S^{i,1} - S^{i,0}), \qquad (3.3)$$

which is equivalent to

$$b = \frac{\mathrm{Var}\,S - ES}{\mathrm{Var}\,S}.$$

Hence, we can write

$$E\mathcal{B}g(S) = [p^2 + bp(1 - p)] \sum_{i=1}^{n} E[\Delta g'(S^{i,1}) - \Delta g'(S)] - (1 - b)p(1 - p) \sum_{i=1}^{n} \{E[g'(S^{i,1}) - g'(S^{i,0})] - E(S^{i,1} - S^{i,0}) E\Delta g'(S)\}. \qquad (3.4)$$

Since $\alpha < \beta$ (see Lemma 2.2), we have

$$\frac{p + b(1 - p)}{1 - b} = p + \frac{\mathrm{Var}\,S - ES}{ES} \le \frac{2(\beta - \alpha)}{1 - \alpha \vee \beta},$$

so (1.4) follows from applying Lemma 2.4 and (3.1) in (3.4) and then collecting like terms.



On approximation of Markov binomial distributions 
Finally, the constants a and b are determined by (3.2) and (3.3). 
The proof of (1.5) is based on the Stein operator for the binomial distribution $\mathrm{Bi}(m, \theta)$,

$$\mathcal{B}g(j) = \theta(m - j)g(j + 1) - (1 - \theta)jg(j), \qquad j \in \mathbb{Z}_+$$

(see Ehm (1991) or Barbour et al. (1992), page 188). The idea of the proof is similar to
that in Soon (1996), but at the cost of a slight increase in complexity, we can achieve
the better estimate (1.5). As $\mathrm{Bi}(m, \theta)$ has support on $\{0, 1, \ldots, m\}$ while $S$ has support
on $\{0, 1, \ldots, n\}$ and it is possible that $n > m$, in estimating the distance between $\mathcal{L}S$
and $\mathrm{Bi}(m, \theta)$, one often needs to deal with $S$ on $\{S \ge m + 1\}$ separately. The following
technical lemma helps us to avoid this issue.
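
As with the negative binomial case, the binomial Stein operator has zero mean under $\mathrm{Bi}(m, \theta)$ for any function $g$, because $\theta(m - j)\,\mathrm{Bi}(m,\theta)\{j\} = (1 - \theta)(j + 1)\,\mathrm{Bi}(m,\theta)\{j + 1\}$ and the sum telescopes. A minimal Python sketch of this check follows; the function names are hypothetical and the example is only an illustration of the operator quoted above.

```python
import math


def binomial_pmf(m, theta):
    """Masses of Bi(m, theta) on {0, ..., m}."""
    return [math.comb(m, k) * theta ** k * (1.0 - theta) ** (m - k)
            for k in range(m + 1)]


def binomial_stein(g, m, theta, j):
    """theta (m - j) g(j + 1) - (1 - theta) j g(j)."""
    return theta * (m - j) * g(j + 1) - (1.0 - theta) * j * g(j)


def expected_binomial_stein(g, m, theta):
    """Expectation of the Stein operator applied to g under Bi(m, theta)."""
    pmf = binomial_pmf(m, theta)
    return sum(pmf[j] * binomial_stein(g, m, theta, j) for j in range(m + 1))
```

For instance, `expected_binomial_stein(math.sin, 12, 0.3)` vanishes up to floating-point error. Note that the term with $j = m$ never uses $g(m + 1)$, which is why values of $g$ above $m$ can be chosen freely in the lemma that follows.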



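Lemma 3.1. For each $A \subset \mathbb{Z}_+$, there exists a bounded function $g_A$ on $\mathbb{Z}_+$ such that

$$\mathcal{B}g_A(j) \ge 1_{\{j \in A\}} - \mathrm{Bi}(m, \theta)(A) \quad \text{for } j \in \mathbb{Z}_+ \qquad (3.5)$$

and

$$\|\Delta g_A\| \le \frac{1}{m\theta(1 - \theta)}. \qquad (3.6)$$

Proof. For $0 \le j \le m$, define $g_A(j)$ as in Barbour et al. (1992), page 189, that is,
$g_A(j)$, $0 \le j \le m$, is the solution to the Stein equation

$$\mathcal{B}g_A(j) = 1_{\{j \in A\}} - \mathrm{Bi}(m, \theta)(A), \qquad 0 \le j \le m. \qquad (3.7)$$

For $j \ge m + 1$, let

$$g_A(j) = \begin{cases} -\dfrac{1 - \theta\,\mathrm{Bi}(m, \theta)(A)}{m\theta(1 - \theta)}, & \text{if } m \notin A, \\[2ex] -\dfrac{1 + \theta - \theta\,\mathrm{Bi}(m, \theta)(A)}{m\theta(1 - \theta)}, & \text{if } m \in A. \end{cases}$$

Direct verification then ensures that

$$\mathcal{B}g_A(j) \begin{cases} = 1_{\{j \in A\}} - \mathrm{Bi}(m, \theta)(A), & \text{if } 0 \le j \le m, \\ \ge 1 - \mathrm{Bi}(m, \theta)(A), & \text{if } j \ge m + 1, \end{cases}$$

which, in turn, implies (3.5). Using (3.7) with $j = m$, we conclude that

$$g_A(m) = \begin{cases} \dfrac{\mathrm{Bi}(m, \theta)(A)}{m(1 - \theta)}, & \text{if } m \notin A, \\[2ex] \dfrac{-1 + \mathrm{Bi}(m, \theta)(A)}{m(1 - \theta)}, & \text{if } m \in A. \end{cases}$$

Thus,

$$|\Delta g_A(j)| = \begin{cases} \dfrac{1}{m\theta(1 - \theta)}, & \text{if } j = m, \\ 0, & \text{if } j \ge m + 1. \end{cases}$$

The claim (3.6) follows easily from the proof of Lemma 9.2.1, Barbour et al. (1992). □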

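Proof of (1.5). Let $A_0 := \{i : P(S = i) > \mathrm{Bi}(m, \theta)\{i\}\}$ and abbreviate $g_{A_0}$ to $g$. From
Lemma 3.1, we have that

$$d_{TV}(\mathcal{L}S, \mathrm{Bi}(m, \theta)) = P(S \in A_0) - \mathrm{Bi}(m, \theta)(A_0) \le E\mathcal{B}g(S). \qquad (3.8)$$

Therefore, it remains to show that $E\mathcal{B}g(S)$ is bounded by the right-hand side of (1.5).
To this end,

$$E\mathcal{B}g(S) = \theta E[(m - S)g(S + 1)] - (1 - \theta)E[Sg(S)]$$
$$= m\theta E[g(S + 1)] - \theta E[S\Delta g(S)] - E[Sg(S)]$$
$$= p(p - \theta) \sum_{i=1}^{n} E\Delta g(S^{i,1} + 1) - p(1 - p) \sum_{i=1}^{n} E[g(S^{i,1} + 1) - g(S^{i,0} + 1)]$$
$$= p(p - \theta) \sum_{i=1}^{n} [E\Delta g(S^{i,1} + 1) - E\Delta g(S + 1)] \qquad (3.9)$$
$$\quad - p(1 - p) \sum_{i=1}^{n} \{E[g(S^{i,1} + 1) - g(S^{i,0} + 1)] - E(S^{i,1} - S^{i,0}) E\Delta g(S + 1)\}$$
$$\quad + \left( np(p - \theta) + p(1 - p) \sum_{i=1}^{n} E(S^{i,0} - S^{i,1}) \right) E\Delta g(S + 1)$$
$$:= I_1 + I_2 + I_3.$$

By Lemma 2.4 and (3.6), we have

$$|I_1| \le \frac{|p - \theta|}{1 - \theta} \cdot \frac{10\,\alpha\vee\beta}{1 - \alpha\vee\beta} \left( \gamma(n/4) + (\alpha\vee\beta)^{\lfloor n/4 \rfloor} \right)$$

and

$$|I_2| \le \frac{1 - p}{1 - \theta} \cdot \frac{|\alpha - \beta|(5 + 23\,\alpha\vee\beta)}{(1 - \alpha\vee\beta)^2} \left( \gamma(n/4) + (\alpha\vee\beta)^{\lfloor n/4 \rfloor} \right),$$

which, in turn, ensure that

$$|I_1| + |I_2| \le \left( \frac{|p - \theta|}{1 - \theta} C_0 + \frac{|\beta - \alpha|}{1 - \theta} C_2 \right) \left( \frac{2K_1}{\sqrt{n}} + \frac{4K_2}{n} + (\alpha\vee\beta)^{\lfloor n/4 \rfloor} \right). \qquad (3.10)$$

To estimate $I_3$, setting $\epsilon = \tilde m - m$, we get

$$\left| np(p - \theta) + p(1 - p) \sum_{i=1}^{n} E(S^{i,0} - S^{i,1}) \right| = |np(p - \theta) + np(1 - p) - \mathrm{Var}\,S| = \left| \frac{(np)^2}{\tilde m} - \frac{(np)^2}{\tilde m - \epsilon} \right| = \frac{m}{\tilde m}\theta^2\epsilon.$$

Therefore, recalling $\|\Delta g\| \le \frac{1}{m\theta(1 - \theta)}$ and $\tilde m \ge m$, we arrive at

$$|I_3| \le \frac{m}{\tilde m} \|\Delta g\| \theta^2 \epsilon \le \frac{\theta\epsilon}{m(1 - \theta)} = \frac{\theta^2 \epsilon}{np(1 - \theta)}. \qquad (3.11)$$

The proof is completed by combining the estimates (3.8)-(3.11). □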